
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "cf723ed9-d8e0-4f1a-911f-887b927f8569",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0] # 哆啦A梦与超级赛亚人：时空之战\n",
      "\n",
      "[1] 在一个寻常的午后，大雄依旧坐在书桌前发呆，作业堆得像山，连第一页都没动。哆啦A梦在一旁翻着漫画，时不时叹口气，觉得这孩子还是一如既往的不靠谱。正当他们的生活照常进行时，一道强光突然从天而降，整个房间震动不已。光芒中走出一名金发少年，身披战甲、气势惊人，他就是来自未来的超级赛亚人——特兰克斯。他一出现便说出了惊人的话：未来的地球即将被黑暗势力摧毁，他来此是为了寻求哆啦A梦的帮助。\n",
      "\n",
      "[2] 哆啦A梦与大雄听后大惊，但也从特兰克斯坚定的眼神中读出了不容拒绝的决心。特兰克斯解释说，未来的敌人并非普通反派，而是一个名叫“黑暗赛亚人”的存在，他由邪恶科学家复制了贝吉塔的基因并加以改造，实力超乎想象。这个敌人不仅拥有赛亚人战斗力，还能操纵扭曲的时间能量，几乎无人可敌。特兰克斯已经独自战斗多年，但每一次都以惨败告终。他说：“科技，是我那个时代唯一缺失的武器，而你们，正好拥有它。”\n",
      "\n",
      "[3] 于是，哆啦A梦带着特兰克斯与大雄启动时光机，穿越到了那个即将崩溃的未来世界。眼前的景象令人震撼：城市沦为废墟，大地裂痕纵横，天空中浮动着压抑的黑雾。特兰克斯说，这正是黑暗赛亚人带来的结果，一切生命几乎都被抹杀，只剩他在苦苦支撑。大雄虽感到恐惧，但看到无辜的人类遭殃，内心逐渐燃起斗志。哆啦A梦则冷静地分析局势，决定使用他最强的三样秘密道具来对抗黑暗势力。\n",
      "\n",
      "[4] 三件秘密道具分别是：可以临时赋予超级战力的“复制斗篷”，能暂停时间五秒的“时间停止手表”，以及可在一分钟中完成一年修行的“精神与时光屋便携版”。大雄被推进精神屋内，在其中接受密集的训练，虽然只有几分钟现实时间，他却经历了整整一年的苦修。刚开始他依旧软弱，想放弃、想逃跑，但当他想起静香、父母，还有哆啦A梦那坚定的眼神时，他终于咬牙坚持了下来。出来之后，他的身体与精神都焕然一新，眼神中多了一份成熟与自信。\n",
      "\n",
      "[5] 最终战在黑暗赛亚人的空中要塞前爆发，特兰克斯率先出击，释放全力与敌人正面对决。哆啦A梦则用任意门和道具支援，从各个方向制造混乱，尽量压制敌人的时空能力。但黑暗赛亚人太过强大，仅凭特兰克斯一人根本无法压制，更别说击败。就在特兰克斯即将被击倒之际，大雄披上复制斗篷、冲破恐惧从高空跃下。他的拳头燃烧着金色光焰，目标直指敌人心脏。\n",
      "\n",
      "[6] 时间停止装置在关键时刻启动，世界陷入静止，大雄用这个短短五秒接近了敌人的盲点。他集中全力，一记重拳击穿了黑暗赛亚人的能量核心，引发巨大的能量反冲。黑暗赛亚人尖叫着化为碎光，天空中的黑雾瞬间散去，阳光重新洒落大地。特兰克斯倒在地上，看着眼前这个曾经懦弱的少年，露出了欣慰的笑容。他知道，这一次，是大雄救了世界。\n",
      "\n",
      "[7] 战后，未来世界开始恢复，植物重新生长，人类重建家园。特兰克斯告别时紧紧握住大雄的手，说：“你是我见过最特别的战士。”哆啦A梦也为大雄感到骄傲，说他终于真正成长了一次。三人站在山丘上，看着远方重新明亮的地平线，心中感受到从未有过的安宁。随后，哆啦A梦与大雄乘坐时光机返回了属于他们的那个年代，一切仿佛又恢复平静。\n",
      "\n",
      "[8] 回到现代后，大雄仿佛变了一个人，不再轻易抱怨、不再逃避责任。他认真写完作业，帮妈妈买菜，甚至主动练习体育，哆啦A梦惊讶得说不出话来。他知道，这不是一时兴起，而是大雄真正内心成长的结果。大雄有时会望着天空出神，仿佛还能看见未来世界的那一片废墟与重生的希望。他不会说出来，但他心中永远铭记那一战。\n",
      "\n",
      "[9] 几天后，电视新闻中突然出现一则画面：一位金发少年在街头击退了失控的机器人，引发市民围观与猜测。大雄放下手中的课本，望向哆啦A梦，两人心照不宣地笑了。也许，特兰克斯又回来了，也许，新的敌人正在逼近。冒险从未真正结束，而他们，早已准备好了。无论时空如何动荡，他们将永远并肩作战。\n",
      "\n"
     ]
    }
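   ],
   "source": [
    "from typing import List\n",
    "\n",
    "def split_into_chunks(doc_file: str) -> List[str]:\n",
    "    \"\"\"Split a document into chunks on blank lines (paragraph boundaries).\"\"\"\n",
    "    # Assumes doc.md is UTF-8 encoded; the platform default encoding may not decode Chinese text.\n",
    "    with open(doc_file, 'r', encoding='utf-8') as file:\n",
    "        content = file.read()\n",
    "\n",
    "    return content.split(\"\\n\\n\")\n",
    "\n",
    "chunks = split_into_chunks(\"doc.md\")\n",
    "\n",
    "# Print each chunk with its index to check where the document was split.\n",
    "for i, chunk in enumerate(chunks):\n",
    "    print(f\"[{i}] {chunk}\\n\")"
   ]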
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "cfe9bf60-5d21-4696-99a5-7e7f3b94dd06",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "768\n",
      "[0.026805490255355835, 0.008382031694054604, 0.0003433557285461575, 0.007298996206372976, 0.054333169013261795, -0.05325590819120407, 0.0013655555667355657, -0.001318215625360608, -0.03671124577522278, 0.07188179343938828, -0.007270680740475655, -0.00705300085246563, 0.042532842606306076, -0.03675277531147003, -0.054750584065914154, -0.00959857553243637, 0.017105547711253166, 0.05915362015366554, -0.033350009471178055, 0.06237656623125076, -0.004888544324785471, -0.034539539366960526, -0.07407604157924652, 0.04422193020582199, 0.01051688939332962, -0.03707781061530113, -0.027029838413000107, 0.03830365091562271, 0.02128245122730732, -0.011811483651399612, -0.005408741533756256, 0.002659021643921733, -0.0232985969632864, 0.052990928292274475, 0.00514944875612855, 0.029624201357364655, -0.030809666961431503, -0.01785610243678093, 0.04244604706764221, -0.0076923067681491375, -0.010638090781867504, 0.03210863843560219, -0.06592466682195663, -0.012100917287170887, 0.006814612075686455, -0.0011549685150384903, -0.020827773958444595, 0.027529601007699966, -0.045469943434000015, 0.05177016928792, -0.05147423967719078, 0.256486177444458, 0.05031631141901016, 0.01756429858505726, -0.01156522799283266, -0.013389571569859982, 0.01608261838555336, 0.01845124363899231, -0.006780586205422878, 0.03105003945529461, 0.05057542026042938, -0.012529821135103703, 0.06348630040884018, -0.0484028086066246, 0.0009181135101243854, 0.021004796028137207, -0.010703880339860916, -0.060850419104099274, 0.006804743316024542, 0.05067181959748268, -0.024943452328443527, -0.012202016077935696, 0.018374742940068245, 0.032339856028556824, 0.04231107607483864, 0.00020395148021634668, -0.017155038192868233, -0.016037967056035995, -0.03760528191924095, -0.0436251275241375, -0.04164879024028778, 0.016573479399085045, 0.009964775294065475, -0.0601021945476532, -0.027282945811748505, 0.02003115601837635, 0.01041433960199356, -0.04751446098089218, -0.01861216500401497, -0.011965056881308556, 
-0.0626218393445015, -0.040164973586797714, -0.009429669007658958, 0.01911354809999466, 0.00968236941844225, 0.0019715253729373217, 0.044232532382011414, -0.0038625069428235292, 0.0002937931567430496, 0.035808999091386795, -0.03436819463968277, -0.0068693519569933414, -0.029506158083677292, 0.002121184952557087, -0.016589509323239326, -0.014159929007291794, -0.018725117668509483, 1.7337908502668142e-05, 0.0009454165119677782, 0.04507479816675186, 0.006473736371845007, 0.033986520022153854, -0.004506746772676706, 0.04930451512336731, -0.012377956882119179, 0.0285202469676733, 0.02815270982682705, -0.012006565928459167, 0.012232259847223759, 0.003402915084734559, -0.015172976069152355, -0.02224714681506157, -0.014169597998261452, 0.023108800873160362, 0.009890844114124775, 0.04284041374921799, -0.03076101653277874, -0.03398944437503815, 0.01835845224559307, 0.03800955042243004, 0.023491961881518364, 0.02149423398077488, 0.0005227993824519217, -0.09578022360801697, 0.020698778331279755, -0.012416579760611057, 9.564748324919492e-05, -0.004413343034684658, -0.06176118552684784, -0.0637013390660286, 0.04251592233777046, -0.019275745376944542, -0.021949509158730507, 0.009037391282618046, 0.015709156170487404, 0.03664666414260864, -0.01125259418040514, 0.030907969921827316, -0.00798176322132349, -0.020503822714090347, 0.01846618577837944, 0.039743512868881226, -0.025002015754580498, 0.05026368424296379, 0.007021986413747072, 0.008837787434458733, 0.053539276123046875, -0.022880319505929947, -0.06497209519147873, 0.04554792866110802, -0.01892913691699505, -0.00956946425139904, 0.01566331461071968, -0.019115228205919266, -0.03303704410791397, 0.009130401536822319, -0.014804661273956299, -0.0005449718446470797, 0.03782360255718231, -0.024828270077705383, -0.020784923806786537, 0.06556414812803268, 0.05158453807234764, 0.027920952066779137, 0.021408215165138245, 0.018681321293115616, -0.006588049232959747, 0.04769238084554672, 0.01102996151894331, 0.022862378507852554, 
-0.1066974475979805, 0.01798075996339321, 0.06006713956594467, -0.020247293636202812, -0.013238568790256977, -0.0037571738939732313, -0.03314714506268501, 0.034960370510816574, 0.002900635590776801, 0.005402435548603535, -0.01199850719422102, -0.002979094162583351, -0.01913699321448803, -0.01963256672024727, 0.042408112436532974, -0.009020981378853321, -0.007898871786892414, -0.02155088074505329, -0.03288109228014946, -0.03879505395889282, -0.004414170514792204, -0.013777372427284718, 0.00238968338817358, 0.00890575721859932, 0.01703587733209133, 0.01844414323568344, -0.0013177923392504454, -0.03948662057518959, -0.04127557948231697, -0.011266768909990788, 0.02037428691983223, -0.033709704875946045, 0.00020459499501157552, -0.009100595489144325, 0.025252535939216614, 0.03549852594733238, 0.03128746151924133, 0.014305435121059418, -0.06018892303109169, -0.014268000610172749, -0.02548174373805523, -0.02024473063647747, -0.02630832977592945, 0.031466078013181686, -0.001475064316764474, 0.002453514374792576, -0.0038783601485192776, -0.07658121734857559, 0.034462977200746536, -0.021065574139356613, -0.026005128398537636, -0.03663504868745804, -0.042249664664268494, 0.04329921305179596, 0.013586803339421749, 0.018750885501503944, 0.025812409818172455, 0.03914476931095123, -0.03560350090265274, -0.03973193094134331, -0.019383704289793968, 0.03881317377090454, -0.02867351844906807, -0.04575642943382263, -0.030043570324778557, -0.005910515319555998, 0.030279839411377907, -0.00714161666110158, 0.01975640095770359, -0.023295722901821136, -0.03364107012748718, 0.028904961422085762, -0.024780549108982086, 0.0008984438027255237, 0.06980864703655243, 0.02007182501256466, 0.029536647722125053, 0.011422580108046532, 0.04347624629735947, -0.0644560232758522, -0.014401750639081001, 0.014391290955245495, 0.056192561984062195, -0.02488662302494049, -0.04148560389876366, 0.037213314324617386, -0.06418780237436295, -0.0041652037762105465, 0.000365140731446445, -0.02971564792096615, 
-0.019475243985652924, 0.02969675324857235, -0.012616566382348537, -0.003055819310247898, -0.002162757096812129, -0.07429531961679459, 0.01226256787776947, -0.00017993251094594598, 0.012211695313453674, -0.02500842697918415, -0.018118921667337418, -0.05154561251401901, 0.012818480841815472, -0.04199875518679619, -0.01100165769457817, -0.03452016040682793, -0.04893346130847931, 0.026315132156014442, 0.04358907416462898, -0.025372913107275963, 0.013372860848903656, -0.015835821628570557, -0.018673541024327278, 0.0004607795272022486, 0.14620473980903625, -0.022349504753947258, 0.002066070679575205, 0.020974690094590187, 0.05730259045958519, 0.05779651179909706, -0.007015352603048086, -0.059490371495485306, -0.0562482587993145, 0.07196556031703949, 0.008784053847193718, 0.031273599714040756, -0.032829400151968, -0.0310618057847023, -0.038107648491859436, -0.08086158335208893, -0.022178204730153084, 0.01967569999396801, 0.06560677289962769, -0.013616569340229034, -0.04321097582578659, 3.2128355087479576e-05, 0.007617791648954153, 0.05968446657061577, 0.011839373037219048, -0.007775948848575354, -0.022673433646559715, 0.06027138978242874, 0.03070763126015663, 0.10095943510532379, -0.018273746594786644, 0.005221726838499308, -0.018574124202132225, 0.046662505716085434, -0.03984099626541138, 0.06332744657993317, 0.017829325050115585, -0.016721289604902267, 0.059466030448675156, 0.053592223674058914, 0.03375038877129555, 0.01619797758758068, -0.02781626395881176, -0.046106163412332535, 0.028852591291069984, 0.007923401892185211, -0.0038213978987187147, -0.024486063048243523, 0.016442352905869484, -0.014322012662887573, -0.019534612074494362, -0.022928424179553986, 0.0008579301065765321, -0.05842093378305435, -0.022136913612484932, 0.008564689196646214, 0.0034134970046579838, 0.06435392051935196, -0.028461050242185593, -0.018981749191880226, -0.0038737072609364986, -0.015333256684243679, 0.03334107622504234, 0.004719577729701996, 0.051634229719638824, 0.012878591194748878, 
0.01577749289572239, -0.017106935381889343, -0.02635161764919758, -0.002479273360222578, 0.02390153706073761, 0.02081419713795185, -0.007952267304062843, 0.029625173658132553, -0.0962781012058258, 0.041612330824136734, 0.014889370650053024, 0.03456718474626541, 0.021369878202676773, -0.023457961156964302, -0.0010088534327223897, 0.021206438541412354, 0.020208729431033134, 0.05265500396490097, -0.014558150433003902, -0.007277468219399452, -0.020998140797019005, -0.013605262152850628, 0.0324535109102726, -0.059614818543195724, -0.03337843716144562, 0.020222216844558716, 0.01256558671593666, -0.035198792815208435, -0.007077848073095083, -0.028356317430734634, -0.08278723806142807, 0.013192414306104183, 0.01149093545973301, -0.010274944826960564, 0.11089944839477539, 0.007383067160844803, -0.02479495294392109, 0.07341889292001724, -0.03335028514266014, -0.023820802569389343, -0.0029024926479905844, 0.0020615081302821636, -0.005726287607103586, 0.024756139144301414, 0.056058917194604874, -0.1111818477511406, -0.02194671519100666, -0.016140107065439224, 0.04339052364230156, 0.003710277145728469, -0.03505486249923706, 0.03899581357836723, 0.011435078456997871, 0.020221702754497528, -0.026690276339650154, 0.004832264967262745, -0.01584966480731964, -0.05317365378141403, 0.08263998478651047, -0.02741048112511635, 0.0038067451678216457, 0.02108619548380375, 0.011895623989403248, 0.004174358211457729, -0.010561530478298664, -0.04180794209241867, -0.03417474403977394, -0.04522692412137985, 0.010197125375270844, -0.030837591737508774, -0.004010607488453388, -0.06798148900270462, -0.011550962924957275, 0.007941635325551033, -0.0156096825376153, 0.0025790645740926266, -0.015110055916011333, -0.008954651653766632, 0.02007046341896057, -0.035370122641325, -0.05616576597094536, -0.0023005264811217785, 0.024881329387426376, -0.008479471318423748, 0.0319472998380661, 0.04894421249628067, 0.021891994401812553, -0.03589446097612381, 0.03244777023792267, -0.0005915068904869258, 
0.0043019684962928295, 0.04572705551981926, -0.04888239875435829, -0.05986073985695839, 0.06363467127084732, -0.0245661623775959, -0.007733826991170645, -0.0016328304773196578, 0.0020884680561721325, -0.041061822324991226, 0.06061314046382904, -0.021700803190469742, -0.061425067484378815, 0.028309103101491928, 0.04443180561065674, -0.020188895985484123, -0.003210731316357851, -0.006333149969577789, 0.05330390855669975, 0.038410335779190063, 0.02393074333667755, 0.07729264348745346, -0.007035867311060429, 0.010095811448991299, 0.003508197143673897, -0.043646976351737976, 0.019585993140935898, -0.027304796501994133, -0.03865276277065277, -0.00841835979372263, 0.01698395423591137, -0.08224363625049591, -0.003916552290320396, -0.03603890910744667, -0.0029175090603530407, -0.01797184906899929, -0.019280601292848587, 0.03086359053850174, 0.048931825906038284, -0.009585125371813774, -0.08360495418310165, -0.022593526169657707, -0.01238773949444294, -0.011543955653905869, -0.03786725550889969, -0.06550999730825424, 0.0351918488740921, 0.041023898869752884, -0.08397974073886871, -0.017963137477636337, 0.006989818066358566, -0.0484766885638237, 0.015127992257475853, -0.04108656197786331, -0.01268249936401844, -0.006762423552572727, -0.08201917260885239, -0.021286632865667343, 0.015313138253986835, 0.07352936267852783, -0.03893899545073509, -0.015363155864179134, 0.00020188458438497037, 0.03244832903146744, -0.025801507756114006, 0.012067316100001335, 0.023155802860856056, 0.05740969255566597, 0.03198220208287239, 0.001448781811632216, -0.007571096066385508, 0.0018173352582380176, -0.014613236300647259, -0.020077276974916458, -0.019916893914341927, 0.022061653435230255, -0.02093195915222168, -0.007911412976682186, -0.015219916589558125, 0.04887215048074722, -0.02920396998524666, -0.01738612912595272, 0.000506277137901634, 0.019982531666755676, 0.03453020751476288, 0.03601839765906334, -0.02289234846830368, -0.037869349122047424, 0.01251245103776455, -0.022280903533101082, 
-0.06235316023230553, 0.03337497264146805, 0.0249991063028574, -0.002981775440275669, -0.03126857802271843, 0.04454126954078674, -0.020727381110191345, -0.05043894052505493, 0.027192549780011177, 0.004596256650984287, 0.011618747375905514, -0.015578368678689003, -0.010279979556798935, 0.03370494768023491, 0.013912423513829708, -0.04848867654800415, -0.02585037797689438, 0.015299168415367603, 0.029496679082512856, 0.008933362551033497, 0.02277173288166523, 0.05251258611679077, -0.05872463807463646, -0.0163556095212698, 0.004389139823615551, -0.0049561262130737305, -0.00799805298447609, -0.017453569918870926, -0.05093000456690788, -0.035113364458084106, -0.04187609255313873, -0.029814552515745163, -0.03136188164353371, 0.01692931540310383, 0.036952003836631775, 0.016865497455000877, 0.014302453957498074, 0.0077886334620416164, 0.01614723540842533, 0.0008381580701097846, 0.03724340349435806, 0.02870720997452736, 0.031427621841430664, 0.021611150354146957, 0.021806828677654266, 0.04758160561323166, -0.02728993259370327, -0.022526947781443596, 0.02261170744895935, 0.02077341452240944, -0.0410599447786808, -0.0071960557252168655, -0.03604191541671753, -0.019560616463422775, 0.03793584927916527, 0.033026691526174545, -0.005511824041604996, 0.021567612886428833, -0.03302357345819473, 0.03061600960791111, -0.01310470886528492, -0.02195858396589756, 0.0036418845411390066, 0.02930263616144657, 0.03702780604362488, -0.02399143949151039, -0.0136862862855196, 0.001830277149565518, 0.019493259489536285, -0.03094925358891487, -0.02121249958872795, -0.016473684459924698, 0.011225885711610317, 0.03063664585351944, -0.006342798005789518, 0.027236931025981903, -0.021282311528921127, -0.014434577897191048, -0.012314273044466972, -0.00016168381262104958, -0.001879829098470509, 0.05209755897521973, -0.035325925797224045, -0.06617751717567444, 0.03259539604187012, -0.014060616493225098, 0.06100108101963997, 0.07049669325351715, -0.006168830208480358, 0.005293147172778845, 
-0.0500921905040741, -0.03336146101355553, -0.015566726215183735, 0.029270630329847336, -0.0133975213393569, -0.03303980082273483, 0.0014945140574127436, 0.0070680733770132065, -0.014381774701178074, 0.02100299671292305, 0.04071948677301407, -0.038102902472019196, 0.036459241062402725, -0.016191013157367706, 0.009750712662935257, 0.043142177164554596, 0.03573149815201759, 0.03532514348626137, -0.016262006014585495, -0.019360462203621864, -0.013566985726356506, -0.024645084515213966, 0.05361153557896614, -0.008533928543329239, 0.04902370646595955, 0.024370437487959862, 0.0439641997218132, -0.014774022623896599, -0.010161032900214195, -0.05356002226471901, 0.006330407224595547, -0.04033401235938072, 0.02220681495964527, -0.0013751694932579994, -0.039880506694316864, -0.06964975595474243, 0.00015010788047220558, -0.0005040470277890563, 0.06797171384096146, 0.03835141286253929, 0.07828047126531601, 0.022008778527379036, 0.005389328580349684, 0.01744518242776394, 0.00010456371819600463, 0.028055241331458092, 0.03183517977595329, 0.014127422124147415, -0.04564821720123291, -0.04990759864449501, -0.010870479978621006, 0.005909291096031666, 0.04774388670921326, 0.005949125159531832, -0.009914574213325977, -0.05552608519792557, -0.017545755952596664, -0.008860052563250065, -0.04103097319602966, -0.051645126193761826, 0.012700834311544895, 0.06362909078598022, -0.023315856233239174, 0.013669350184500217, 0.016287028789520264, 0.2864297032356262, -0.029474951326847076, 0.010040824301540852, -0.04473850131034851, 0.038879115134477615, 0.006894038058817387, 0.02239929884672165, -0.0007777800783514977, -0.015402164310216904, 0.00022744396119378507, 0.002521264599636197, 0.012183399870991707, 0.03795032575726509, -0.039392974227666855, 0.0036947918124496937, 0.008819268085062504, 0.01187850534915924, -0.0031687491573393345, 0.014764891937375069, 0.0264655202627182, 0.005468143615871668, -0.014688628725707531, 0.010408896021544933, -0.0008929266477935016, 0.031062856316566467, 
0.0058838180266320705, 0.002263352507725358, 0.03287533298134804, -0.021614348515868187, 0.061867572367191315, 0.031227990984916687, -0.014700375497341156, 0.043684180825948715, -0.002161695621907711, 0.004687688313424587, -0.03353721275925636, -0.018313048407435417, 0.012886503711342812, -0.017576536163687706, 0.013265102170407772, 0.016103139147162437, 0.008023952133953571, -0.04410070925951004, -0.009516923688352108, 0.026512015610933304, -0.04530531167984009, 0.0014346969546750188, 0.0005016304203309119, -0.019016437232494354, -0.03256050869822502, -0.06451268494129181, 0.0252375528216362, -0.03076227940618992, -0.010278875939548016, -0.0256830845028162, 0.0009958111913874745, 0.0017538159154355526, -0.04831397533416748, -0.024519694969058037, 0.06815527379512787, -0.02471328154206276, 0.044039610773324966, 0.00262440531514585, -0.020195696502923965, 0.02507602795958519, 0.013134405016899109, 0.030321385711431503, 0.04385118931531906, -0.0184311643242836, -0.0738498643040657, -0.03507034853100777, -0.053929831832647324, -0.008397717028856277]\n"
     ]
    }
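   ],
   "source": [
    "from typing import List\n",
    "\n",
    "from sentence_transformers import SentenceTransformer\n",
    "\n",
    "# Chinese sentence-embedding model (768-dimensional output).\n",
    "embedding_model = SentenceTransformer(\"shibing624/text2vec-base-chinese\")\n",
    "\n",
    "def embed_chunk(chunk: str) -> List[float]:\n",
    "    \"\"\"Embed one chunk of text and return the vector as a plain list of floats.\"\"\"\n",
    "    # normalize_embeddings=True scales the vector to unit length.\n",
    "    embedding = embedding_model.encode(chunk, normalize_embeddings=True)\n",
    "    return embedding.tolist()\n",
    "\n",
    "\n",
    "# Sanity check on a short test string (\"测试内容\" means \"test content\").\n",
    "embedding = embed_chunk(\"测试内容\")\n",
    "print(len(embedding))\n",
    "print(embedding)"
   ]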
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "87f48192-d9f7-4270-ae08-e5e0300bbb32",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10\n",
      "[-0.01957528479397297, 0.0071845026686787605, 0.02306997776031494, -0.01243650447577238, 0.03920751437544823, -0.05374186486005783, 0.028527185320854187, -0.02104203589260578, -0.0017695821588858962, 0.04136233776807785, -0.025198299437761307, -0.05593806877732277, 0.07257923483848572, 0.021626561880111694, -0.004362831357866526, -0.00028653739718720317, 0.060211509466171265, 0.0262150876224041, -0.04922763258218765, 0.009307703003287315, 0.013933558948338032, -0.005938100162893534, -0.036834198981523514, 0.023301733657717705, 0.010850701481103897, 0.004264288581907749, 0.0037720329128205776, -0.024697503075003624, 0.0013592864852398634, 0.05580887943506241, 0.021838366985321045, 0.0460783988237381, -0.06695900857448578, 0.02910572849214077, 0.019366616383194923, -0.021051250398159027, 0.015360517427325249, -0.0030887385364621878, 0.010731684975326061, 0.02203541435301304, 0.03437020629644394, 0.04636266827583313, -0.05769671872258186, -0.059550102800130844, 0.001739225466735661, 0.055718936026096344, 0.0004280135908629745, 0.047776516526937485, -0.032583389431238174, 0.033878959715366364, -0.05590406060218811, 0.31629088521003723, 0.031006356701254845, -0.024298109114170074, -0.009092274121940136, 0.06546442210674286, 0.023344915360212326, 0.006341062020510435, 0.018706809729337692, 0.022990958765149117, 0.013833963312208652, -0.013355066999793053, -0.016584236174821854, -0.02496802993118763, -0.014794430695474148, -0.013685967773199081, -0.024329252541065216, -0.03959378972649574, -0.04097596928477287, -0.04594769328832626, 0.02948920615017414, 0.03400905430316925, -0.06798189133405685, -0.0006201657233759761, 0.03166960924863815, 0.0025441807229071856, 0.03436632454395294, -0.020243553444743156, 0.0021672844886779785, 0.010164204053580761, 0.008175142109394073, -0.06498733907938004, -0.008412339724600315, -0.038191694766283035, -0.00544627383351326, -0.01196867786347866, 0.009688454680144787, -0.019435102120041847, 0.0774189829826355, 
-0.04710289090871811, -0.018973316997289658, -0.0323178693652153, -0.022417033091187477, 0.026514200493693352, -0.0012652672594413161, -0.012755231000483036, 0.026539411395788193, -0.016730738803744316, -0.03221429884433746, -0.03031834401190281, 0.03064890205860138, -0.00827833078801632, -0.04636299982666969, -0.04409274086356163, -3.1786083809492993e-07, -0.031140746548771858, 0.03442680090665817, 0.008577430620789528, 0.003842894220724702, 0.03769104555249214, -0.025073101744055748, -0.048826396465301514, 0.04901166632771492, 0.04935592785477638, -0.009122985415160656, 0.05134632810950279, -0.04378079995512962, 0.03778235241770744, -0.005146013107150793, 0.04410061240196228, 0.022870438173413277, 0.025381729006767273, -0.00895349308848381, -0.01889839768409729, -0.004075868520885706, 0.00919483695179224, 0.00367836095392704, -0.01064225658774376, -0.0008922333945520222, 0.005787273868918419, 0.010112900286912918, -0.009371702559292316, -0.08127478510141373, -0.00020790440612472594, 0.012738212011754513, -0.022971300408244133, -0.04988085478544235, -0.043112222105264664, 0.004414522089064121, -0.03586793318390846, 0.04770402982831001, -0.016997884958982468, 0.04105355963110924, 0.021058134734630585, 0.05482718348503113, -0.018240125849843025, 0.033797506242990494, -0.0142332399263978, 0.01242251880466938, -0.013608139008283615, 0.018477654084563255, 0.01952444016933441, -0.0533284917473793, 0.05681045353412628, 0.003922647330909967, -0.03331361338496208, -0.0509319081902504, 0.0021759807132184505, -0.07231573015451431, 0.024132350459694862, -0.023186899721622467, 0.019144004210829735, -0.008616582490503788, 0.02687547169625759, 0.027584120631217957, 0.04379681497812271, -0.02574644796550274, 0.01259980071336031, 0.09022204577922821, -0.02060437761247158, -0.03563583269715309, -0.006897639948874712, 0.014002660289406776, 0.017634935677051544, 0.028756994754076004, 0.029499610885977745, 0.0463830903172493, -0.02449885755777359, -0.06291414797306061, 
-0.049077171832323074, -0.02135143242776394, 0.02646525576710701, 0.04578240588307381, -0.005862806458026171, -0.03215063363313675, 0.0030407763551920652, -0.010758422315120697, -0.010290583595633507, 0.02317485399544239, 0.007631399668753147, -0.038570016622543335, -0.005427136551588774, -0.04518400877714157, -0.012234384194016457, 0.009643820114433765, 0.02593832276761532, 0.022813986986875534, -0.039789214730262756, -0.040891747921705246, -0.011615634895861149, -0.03620956093072891, 0.001296409172937274, -0.020042939111590385, -0.05397959426045418, 0.008687137626111507, 0.03580191358923912, -0.03461932763457298, -0.03228489309549332, 0.021346960216760635, 0.019853917881846428, -0.007873166352510452, 0.0013645670842379332, 0.014947522431612015, 0.01498851552605629, 0.027462493628263474, 6.537726585520431e-05, -0.03034498170018196, 0.0026997511740773916, -0.02072499692440033, 0.03037250228226185, -0.043989766389131546, -0.028308311477303505, 0.02907032147049904, 0.0018611715640872717, -0.011147924698889256, 0.03601804003119469, -0.008285551331937313, -0.04343612864613533, -0.0023190281353890896, 0.030167285352945328, -0.0046984488144516945, -0.015231155790388584, -0.042644161731004715, -0.022069083526730537, -0.0012204418890178204, -0.018037252128124237, -0.049708522856235504, 0.00884786993265152, -0.048723384737968445, -0.033828169107437134, 0.04059727117419243, 0.00846573431044817, -0.025226719677448273, -0.0006588653195649385, 0.01767013408243656, 0.013550328090786934, 0.005875378381460905, -0.02755606733262539, 0.0028685086872428656, 0.0055671678856015205, 0.05288989096879959, 0.002786338096484542, 0.011704713106155396, -0.05445016920566559, 0.04906482249498367, 0.029993433505296707, -0.0010609030723571777, -0.05468842759728432, 0.021854443475604057, 0.017882559448480606, 0.04986065998673439, -0.08198913186788559, 0.027324479073286057, -0.0029179819393903017, 0.03234079107642174, -0.02074027620255947, -0.0005491028423421085, -0.02816574089229107, 
-0.03474278002977371, 0.07092531770467758, -0.025202810764312744, -0.0014516295632347465, 0.010104844346642494, -0.055702053010463715, -0.0028454517014324665, -0.014473466202616692, 0.024009767919778824, -0.029762500897049904, -0.038814201951026917, 0.007417554035782814, -0.005394774954766035, 0.021935097873210907, 0.02445456013083458, 0.00044718163553625345, 0.0007251976639963686, -0.019695639610290527, -0.021868359297513962, 9.935579146258533e-05, -0.03830120712518692, -0.03355366364121437, -0.018670978024601936, -0.05672980099916458, 0.02321361005306244, -0.004610550124198198, 0.06102452054619789, -0.038619447499513626, -0.029414257034659386, -0.023253343999385834, 0.0623246505856514, -0.02094912715256214, 0.008914719335734844, 0.019913826137781143, -0.014146669767796993, 0.023585299029946327, -0.029241392388939857, -0.015371236950159073, 0.07652347534894943, 0.031256020069122314, -0.022616777569055557, 0.03711768239736557, -0.015307042747735977, -0.004552371799945831, 0.025201234966516495, -0.03407277539372444, 0.0007222170243039727, 0.024411357939243317, -0.0017289927927777171, 0.01381227932870388, 0.08538075536489487, 0.023145489394664764, 0.020202508196234703, 0.05879039317369461, 0.04270470142364502, 0.05961151421070099, -0.04502652958035469, -0.033291783183813095, -0.010204363614320755, -0.03146468475461006, -0.011219746433198452, 0.005433801561594009, -0.005610180087387562, -0.013676077127456665, -0.014545616693794727, 0.054272156208753586, -0.009991266764700413, 0.018363282084465027, 0.028053410351276398, 0.004931783303618431, -0.04974064975976944, -0.07254650443792343, 0.00212119915522635, -0.030468035489320755, -0.030541840940713882, -0.03873363882303238, 0.014980868436396122, -0.027602829039096832, -0.0015531110111624002, -0.02611648663878441, -0.07665715366601944, 0.03843327984213829, 0.02988000214099884, 0.038939304649829865, 0.022966809570789337, -0.006102388724684715, -0.012584551237523556, -0.010450907051563263, 0.03052501380443573, 
0.02369799092411995, 0.019688846543431282, 0.019036849960684776, -0.03237953409552574, 0.02658098377287388, -0.008133850991725922, -0.011671938933432102, 0.029618598520755768, 0.009453958831727505, 0.032548703253269196, -0.034176722168922424, -0.016558555886149406, -0.0026682810857892036, 0.02670804038643837, 0.05767473578453064, 0.012620396912097931, -0.020122310146689415, -0.043604183942079544, -0.008301204070448875, -0.027785683050751686, -0.012901580892503262, 0.027029309421777725, 0.03629679977893829, 0.03548884391784668, -0.032955531030893326, 0.021104997023940086, -0.04240449517965317, 0.026610247790813446, -0.026204997673630714, 0.0014536086237058043, 0.0044026500545442104, -0.014022729359567165, -0.03229153901338577, -0.010603921487927437, -0.017652278766036034, -0.0019927779212594032, -0.04984701797366142, -0.0360260084271431, 0.005037266295403242, -0.04908370226621628, 0.013484395109117031, -0.0264466293156147, 0.019589997828006744, -0.024231042712926865, -0.00906869675964117, -0.010864904150366783, -0.025246571749448776, -0.0003251359739806503, -0.016430051997303963, 0.01986798830330372, 0.03290141373872757, 0.05426553636789322, -0.07008344680070877, -0.021705325692892075, 0.04489165544509888, -0.004063697997480631, -0.020953159779310226, 0.005982239730656147, -0.038654305040836334, -0.0615711584687233, 0.005439154338091612, 0.02711702138185501, -0.008448636159300804, -0.016440702602267265, -0.02595207840204239, 0.022699205204844475, -0.0225283894687891, 0.06538804620504379, -0.0837266743183136, -0.05089440196752548, -0.03153529763221741, -0.014182159677147865, -0.024516398087143898, 0.05083746463060379, 0.0011483473936095834, -0.011152953840792179, 0.019425606355071068, -0.004413371905684471, -0.011585768312215805, -0.008473867550492287, 0.006571218837052584, 0.005964442156255245, 0.00413248548284173, -0.039289552718400955, 0.01673920825123787, 0.05932610109448433, 0.02931138314306736, 0.03729153424501419, -0.032178085297346115, 0.09054069966077805, 
0.003541399724781513, -0.02040102891623974, 0.0335691012442112, 0.0023426006082445383, 0.00119302561506629, -0.00909470021724701, -0.058401864022016525, 0.0735490545630455, -0.02012190781533718, 0.012296575121581554, -0.0022191032767295837, -0.006940528750419617, 0.05098891258239746, 0.025116506963968277, -0.02006465382874012, 0.025356946513056755, 0.05129021778702736, 0.021223951131105423, -0.00527716102078557, -0.05134671926498413, 0.016074014827609062, 0.0034901625476777554, 0.07571224868297577, 0.06060843914747238, 0.019657311961054802, -0.006239632144570351, -0.02126443386077881, 0.017717167735099792, -0.028178948909044266, -0.05186495557427406, -0.031384702771902084, -0.027058260515332222, -0.1038145199418068, 0.030338777229189873, -0.06478016078472137, -0.06098339334130287, 0.021071525290608406, 0.02529301308095455, 0.06755194067955017, -0.0365900844335556, -0.0320032574236393, 0.021191315725445747, 0.012678784318268299, -0.02029627375304699, -0.006004428956657648, 0.04356564208865166, 0.008364992216229439, -0.031111782416701317, -0.14221756160259247, 0.013288375921547413, -0.029158709570765495, -0.03101993165910244, 0.014649789780378342, 0.02282245084643364, -0.002655118703842163, 0.04423387721180916, 0.040543366223573685, -0.01459558680653572, 0.011473331600427628, -0.09130794554948807, 0.005705492105334997, 0.007437976077198982, 0.003443444613367319, -0.012914036400616169, -0.02817201055586338, 0.04175341874361038, -0.026228178292512894, 0.05791374668478966, 0.011094283312559128, -0.020072586834430695, 0.018362529575824738, 0.03950252756476402, 0.014905636198818684, 0.05232280120253563, -0.013581621460616589, -0.026190543547272682, 0.021684039384126663, -0.04847168177366257, 0.017087098211050034, -0.03627798333764076, 0.010164485312998295, -0.05211503058671951, 0.029964562505483627, 0.0173325277864933, 0.05204574018716812, -0.07522927969694138, -0.04038763418793678, 0.024310925975441933, 0.04496125131845474, 0.05316838249564171, 0.0563988983631134, 
0.04699460044503212, -0.014250917360186577, -0.019624482840299606, 0.0009378183167427778, 0.0310945026576519, -0.017893267795443535, -0.03482426702976227, -0.005534008610993624, 0.01584438793361187, -0.016579166054725647, -0.010253747925162315, -0.0047560748644173145, -0.00924754049628973, -0.030618930235505104, 0.008117228746414185, 0.019564907997846603, 0.01977541856467724, -0.028233814984560013, -0.07369294762611389, 0.002558659063652158, -0.009378047659993172, -0.003298813011497259, -0.0027088045608252287, -0.05723100155591965, 0.010283347219228745, -0.04216748848557472, -0.036464303731918335, -0.016274282708764076, 0.02037029154598713, -0.0324445441365242, -0.05168528109788895, -0.0017739551840350032, 0.027014052495360374, -0.018742723390460014, 0.042285509407520294, 0.08014560490846634, -0.024416930973529816, 0.01738288439810276, -0.011250934563577175, -0.03967079520225525, 0.00026214413810521364, -0.03529912978410721, 0.03928031399846077, 0.014210480265319347, 0.031595125794410706, -0.04646536335349083, 0.02043098211288452, 0.00370613532140851, 0.001798764686100185, -0.05635920539498329, -0.001384770148433745, 0.014804893173277378, -0.017672238871455193, 0.0002200805174652487, 0.009399833157658577, 0.059968188405036926, -0.014980429783463478, 0.015514206141233444, -0.057714421302080154, 0.00987586472183466, 0.029005253687500954, 0.0031361172441393137, 0.0012561642797663808, 0.10040706396102905, -0.026372281834483147, -0.046819206327199936, 0.02913898602128029, -0.007474694401025772, 0.04505518451333046, 0.008728785440325737, -0.011935564689338207, -0.07580941915512085, -0.03477557748556137, 0.00831272266805172, -0.055446967482566833, 0.04934819042682648, -0.00777488574385643, 0.04374036192893982, -0.0200048815459013, 0.034482166171073914, 0.01088374201208353, -0.060005661100149155, -0.018714187666773796, -0.039561059325933456, -0.014007232151925564, -0.011436201632022858, 0.008793581277132034, -0.004188893362879753, -0.023026449605822563, 
0.000793477229308337, 0.006244379561394453, 0.029402894899249077, -0.03749292343854904, 0.002670788671821356, 0.06962326914072037, -0.04893488436937332, 0.009129672311246395, 0.03313245251774788, -0.007266621571034193, -0.012852302752435207, 0.003959826659411192, 0.019211044535040855, 0.0336722694337368, 0.024616405367851257, -0.027333522215485573, -0.04864329472184181, 0.07117068767547607, 0.006693963892757893, 0.04965028911828995, 0.004121767822653055, -0.014042000286281109, 0.004240228794515133, -0.0017321017803624272, 0.006880928296595812, 0.05296022817492485, 0.012016523629426956, 0.008241629227995872, -0.05940231680870056, 0.07657930999994278, -0.03656354174017906, -0.052963737398386, -0.04976903647184372, 0.027502812445163727, -0.007105723023414612, -0.03657204285264015, -0.03655019775032997, -0.0018084770999848843, -0.07224306464195251, 0.00965823046863079, -0.01107352040708065, 0.008183700032532215, 0.011940575204789639, 0.11663930863142014, 0.07610750198364258, -0.006856543943285942, -0.0034990787971764803, -0.01906534470617771, 0.030794847756624222, 0.009789197705686092, 0.016759170219302177, 0.025321831926703453, 0.002362719504162669, -0.025858130306005478, 0.01238313689827919, -0.0023597576655447483, -0.018748780712485313, -0.015227507799863815, -0.02255311794579029, -0.040607284754514694, -0.027340011671185493, -0.011350784450769424, -0.004592697601765394, -0.03670329228043556, 0.0441880002617836, -0.003862476907670498, -0.03565923497080803, 0.03560398891568184, 0.12003614753484726, -0.0494748130440712, 0.013182580471038818, 0.004166033118963242, 0.0397447906434536, 0.023967020213603973, -0.014209210872650146, 0.04398911073803902, -0.06685987859964371, 0.0017546623712405562, 0.022533763200044632, -0.05065310001373291, -0.09923339635133743, -0.004559351596981287, -0.0017455217894166708, -0.03800130635499954, 0.04672860726714134, -0.013196401298046112, 0.04957875609397888, 0.017210153862833977, -0.041720759123563766, 0.011581121012568474, 
0.06164976954460144, 0.05495823919773102, 0.02367612160742283, 0.022868242114782333, -0.026328956708312035, 0.050871096551418304, 0.007852462120354176, 0.049375567585229874, -0.047566015273332596, 0.03424816206097603, 0.05096570774912834, 0.03547517955303192, 0.0010320600122213364, -0.01896742545068264, 0.037362851202487946, 0.012608657591044903, -0.03988957405090332, -0.050076212733983994, 0.043364617973566055, -0.020738592371344566, 0.08724326640367508, -0.019608398899435997, -0.00797134917229414, 0.009576485492289066, -0.01879779063165188, 0.007389089558273554, -0.010044394060969353, -0.0030236903112381697, 0.01298595406115055, 0.013595462776720524, 0.03613196685910225, -0.03293905407190323, 0.0045963795855641365, 0.02112484723329544, -0.038873910903930664, -0.0007362071773968637, 0.0021365138236433268, 0.01691417396068573, -0.04861413687467575, 0.08700256794691086, -0.029589150100946426, 0.06149933487176895, -0.013487807475030422, -0.0039791204035282135, 0.02002306655049324, 0.0960330218076706, 0.020007073879241943, -0.019597213715314865, -0.002821373287588358, -0.055465180426836014, -0.040571898221969604]\n"
     ]
    }
   ],
   "source": [
    "embeddings = [embed_chunk(chunk) for chunk in chunks]\n",
    "\n",
    "print(len(embeddings))\n",
    "print(embeddings[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "babfbd91-76fc-4467-9ff7-ccaf5ffbbd54",
   "metadata": {},
   "outputs": [],
   "source": [
    "import chromadb\n",
    "\n",
    "chromadb_client = chromadb.EphemeralClient()\n",
    "chromadb_collection = chromadb_client.get_or_create_collection(name=\"default\")\n",
    "\n",
    "def save_embeddings(chunks: List[str], embeddings: List[List[float]]) -> None:\n",
    "    # Store every chunk with its embedding in one batched call; the chunk's\n",
    "    # position in the list doubles as its ID.\n",
    "    chromadb_collection.add(\n",
    "        documents=chunks,\n",
    "        embeddings=embeddings,\n",
    "        ids=[str(i) for i in range(len(chunks))]\n",
    "    )\n",
    "\n",
    "save_embeddings(chunks, embeddings)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "9e47b06d-3f7a-40bd-886a-aca6c7e19f0b",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0] # 哆啦A梦与超级赛亚人：时空之战\n",
      "\n",
      "[1] 三件秘密道具分别是：可以临时赋予超级战力的“复制斗篷”，能暂停时间五秒的“时间停止手表”，以及可在一分钟中完成一年修行的“精神与时光屋便携版”。大雄被推进精神屋内，在其中接受密集的训练，虽然只有几分钟现实时间，他却经历了整整一年的苦修。刚开始他依旧软弱，想放弃、想逃跑，但当他想起静香、父母，还有哆啦A梦那坚定的眼神时，他终于咬牙坚持了下来。出来之后，他的身体与精神都焕然一新，眼神中多了一份成熟与自信。\n",
      "\n",
      "[2] 最终战在黑暗赛亚人的空中要塞前爆发，特兰克斯率先出击，释放全力与敌人正面对决。哆啦A梦则用任意门和道具支援，从各个方向制造混乱，尽量压制敌人的时空能力。但黑暗赛亚人太过强大，仅凭特兰克斯一人根本无法压制，更别说击败。就在特兰克斯即将被击倒之际，大雄披上复制斗篷、冲破恐惧从高空跃下。他的拳头燃烧着金色光焰，目标直指敌人心脏。\n",
      "\n",
      "[3] 战后，未来世界开始恢复，植物重新生长，人类重建家园。特兰克斯告别时紧紧握住大雄的手，说：“你是我见过最特别的战士。”哆啦A梦也为大雄感到骄傲，说他终于真正成长了一次。三人站在山丘上，看着远方重新明亮的地平线，心中感受到从未有过的安宁。随后，哆啦A梦与大雄乘坐时光机返回了属于他们的那个年代，一切仿佛又恢复平静。\n",
      "\n",
      "[4] 哆啦A梦与大雄听后大惊，但也从特兰克斯坚定的眼神中读出了不容拒绝的决心。特兰克斯解释说，未来的敌人并非普通反派，而是一个名叫“黑暗赛亚人”的存在，他由邪恶科学家复制了贝吉塔的基因并加以改造，实力超乎想象。这个敌人不仅拥有赛亚人战斗力，还能操纵扭曲的时间能量，几乎无人可敌。特兰克斯已经独自战斗多年，但每一次都以惨败告终。他说：“科技，是我那个时代唯一缺失的武器，而你们，正好拥有它。”\n",
      "\n"
     ]
    }
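   ],
   "source": [
    "def retrieve(query: str, top_k: int) -> List[str]:\n",
    "    # Embed the query with the same model used for the chunks.\n",
    "    query_embedding = embed_chunk(query)\n",
    "    results = chromadb_collection.query(\n",
    "        query_embeddings=[query_embedding],\n",
    "        n_results=top_k\n",
    "    )\n",
    "    # query() returns one result list per query embedding; only one was sent.\n",
    "    return results['documents'][0]\n",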
    "\n",
    "query = \"哆啦A梦使用的3个秘密道具分别是什么？\"\n",
    "retrieved_chunks = retrieve(query, 5)\n",
    "\n",
    "for i, chunk in enumerate(retrieved_chunks):\n",
    "    print(f\"[{i}] {chunk}\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "e57ac85d-d634-4c1d-93fa-e627cf09a6f1",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0] 三件秘密道具分别是：可以临时赋予超级战力的“复制斗篷”，能暂停时间五秒的“时间停止手表”，以及可在一分钟中完成一年修行的“精神与时光屋便携版”。大雄被推进精神屋内，在其中接受密集的训练，虽然只有几分钟现实时间，他却经历了整整一年的苦修。刚开始他依旧软弱，想放弃、想逃跑，但当他想起静香、父母，还有哆啦A梦那坚定的眼神时，他终于咬牙坚持了下来。出来之后，他的身体与精神都焕然一新，眼神中多了一份成熟与自信。\n",
      "\n",
      "[1] 最终战在黑暗赛亚人的空中要塞前爆发，特兰克斯率先出击，释放全力与敌人正面对决。哆啦A梦则用任意门和道具支援，从各个方向制造混乱，尽量压制敌人的时空能力。但黑暗赛亚人太过强大，仅凭特兰克斯一人根本无法压制，更别说击败。就在特兰克斯即将被击倒之际，大雄披上复制斗篷、冲破恐惧从高空跃下。他的拳头燃烧着金色光焰，目标直指敌人心脏。\n",
      "\n",
      "[2] 战后，未来世界开始恢复，植物重新生长，人类重建家园。特兰克斯告别时紧紧握住大雄的手，说：“你是我见过最特别的战士。”哆啦A梦也为大雄感到骄傲，说他终于真正成长了一次。三人站在山丘上，看着远方重新明亮的地平线，心中感受到从未有过的安宁。随后，哆啦A梦与大雄乘坐时光机返回了属于他们的那个年代，一切仿佛又恢复平静。\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from sentence_transformers import CrossEncoder\n",
    "\n",
    "# Load the cross-encoder once so repeated rerank() calls reuse the same model.\n",
    "cross_encoder = CrossEncoder('cross-encoder/mmarco-mMiniLMv2-L12-H384-v1')\n",
    "\n",
    "def rerank(query: str, retrieved_chunks: List[str], top_k: int) -> List[str]:\n",
    "    # Score each (query, chunk) pair jointly; a higher score means more relevant.\n",
    "    pairs = [(query, chunk) for chunk in retrieved_chunks]\n",
    "    scores = cross_encoder.predict(pairs)\n",
    "\n",
    "    # Sort chunks by descending score and keep the top_k best.\n",
    "    scored_chunks = sorted(zip(retrieved_chunks, scores), key=lambda x: x[1], reverse=True)\n",
    "    return [chunk for chunk, _ in scored_chunks[:top_k]]\n",
    "\n",
    "reranked_chunks = rerank(query, retrieved_chunks, 3)\n",
    "\n",
    "for i, chunk in enumerate(reranked_chunks):\n",
    "    print(f\"[{i}] {chunk}\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "79d844d8-846e-4a88-a19f-c8e282839b99",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "你是一位知识助手，请根据用户的问题和下列片段生成准确的回答。\n",
      "\n",
      "用户问题: 哆啦A梦使用的3个秘密道具分别是什么？\n",
      "\n",
      "相关片段:\n",
      "三件秘密道具分别是：可以临时赋予超级战力的“复制斗篷”，能暂停时间五秒的“时间停止手表”，以及可在一分钟中完成一年修行的“精神与时光屋便携版”。大雄被推进精神屋内，在其中接受密集的训练，虽然只有几分钟现实时间，他却经历了整整一年的苦修。刚开始他依旧软弱，想放弃、想逃跑，但当他想起静香、父母，还有哆啦A梦那坚定的眼神时，他终于咬牙坚持了下来。出来之后，他的身体与精神都焕然一新，眼神中多了一份成熟与自信。\n",
      "\n",
      "最终战在黑暗赛亚人的空中要塞前爆发，特兰克斯率先出击，释放全力与敌人正面对决。哆啦A梦则用任意门和道具支援，从各个方向制造混乱，尽量压制敌人的时空能力。但黑暗赛亚人太过强大，仅凭特兰克斯一人根本无法压制，更别说击败。就在特兰克斯即将被击倒之际，大雄披上复制斗篷、冲破恐惧从高空跃下。他的拳头燃烧着金色光焰，目标直指敌人心脏。\n",
      "\n",
      "战后，未来世界开始恢复，植物重新生长，人类重建家园。特兰克斯告别时紧紧握住大雄的手，说：“你是我见过最特别的战士。”哆啦A梦也为大雄感到骄傲，说他终于真正成长了一次。三人站在山丘上，看着远方重新明亮的地平线，心中感受到从未有过的安宁。随后，哆啦A梦与大雄乘坐时光机返回了属于他们的那个年代，一切仿佛又恢复平静。\n",
      "\n",
      "请基于上述内容作答，不要编造信息。\n",
      "\n",
      "---\n",
      "\n",
      "根据提供的片段，哆啦A梦使用的3个秘密道具分别是：\n",
      "\n",
      "1.  **复制斗篷**：可以临时赋予超级战力。\n",
      "2.  **时间停止手表**：能暂停时间五秒。\n",
      "3.  **精神与时光屋便携版**：可在一分钟中完成一年修行。\n"
     ]
    }
   ],
   "source": [
    "from dotenv import load_dotenv\n",
    "from google import genai\n",
    "\n",
    "load_dotenv()\n",
    "google_client = genai.Client()\n",
    "\n",
    "def generate(query: str, chunks: List[str]) -> str:\n",
    "    # Join the chunks up front: a backslash inside an f-string expression is a\n",
    "    # syntax error before Python 3.12.\n",
    "    context = \"\\n\\n\".join(chunks)\n",
    "    prompt = f\"\"\"你是一位知识助手，请根据用户的问题和下列片段生成准确的回答。\n",
    "\n",
    "用户问题: {query}\n",
    "\n",
    "相关片段:\n",
    "{context}\n",
    "\n",
    "请基于上述内容作答，不要编造信息。\"\"\"\n",
    "\n",
    "    print(f\"{prompt}\\n\\n---\\n\")\n",
    "\n",
    "    response = google_client.models.generate_content(\n",
    "        model=\"gemini-2.5-flash\",\n",
    "        contents=prompt\n",
    "    )\n",
    "\n",
    "    return response.text\n",
    "\n",
    "answer = generate(query, reranked_chunks)\n",
    "print(answer)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}

